Abstract

Warfighters have benefited significantly from the enormous advances in digital technology over
the past several decades. In contrast, too little of the considerable progress in neuroscience has
been applied to improving warfighter performance. We believe this reflects the absence of digital
technology that can help bridge the gap between neuroscience and digital systems, and that this
gap might be filled by constructing a computational model of the neuro-cognitive activity of the
warfighter. We propose that such a model could be created by algorithms applied to
measurements of brain activity obtained using functional MRI (fMRI). Algorithmic processing of
these measurements can exploit a variety of statistical machine learning methods to synthesize a
new kind of neuro-cognitive model, which we call a neurometric model. These executable models
could be incorporated into a number of applications for assessing and improving mental
performance, including battlefield training and treatment of disorders such as PTSD. The
long-term goal is to enable systems that can better adapt to the warfighter in real time by using
model-generated hypotheses about the individual's neuro-cognitive state.

Introduction 

While fMRI has been used for research purposes for over a decade and a half, the analysis of
the data produced by this imaging technology has been primarily for the benefit of
neuro-cognitive researchers who are conducting scientific enquiry into how brains work in
general. It has not been applied to any significant degree to improving the performance of the
individual. These traditional scientific goals have led to an anatomically oriented approach:
attempts are made to determine what functionality individual regions of the brain provide. This
requires a high degree of precision that current fMRI technology has difficulty providing, a
difficulty compounded by the variation among individuals.

Whereas  this  standard  approach  uses  anatomy  as  the  fundamental  frame  of  reference,  we  will 
take  a  different  approach,  one  that  instead  utilizes  computation  as  the  fundamental  frame  of 
reference.  Our  proposed  schema  will  transform  measurements  of  brain  activity  algorithmically 
and  automatically  into  an  abstract  neuro-cognitive  computational  model  of  simple  tasks  being 
performed by individuals. We call such measurement-driven models Neurometric Models.

While computational neuroscience has been pursuing modeling of the brain for several
decades, many of these efforts have been 1) anatomically based, 2) concerned with brains in
general, 3) spatially oriented, and 4) built manually as software models. In contrast, our
approach will be 1) computationally based, 2) targeted at modeling a given individual's brain,
3) focused on the temporal domain, and 4) synthesized automatically. This last point is critical
given our goal of modeling individual brains, because the cost of manual model construction
would most likely be prohibitive otherwise.

Our  approach  builds  on  recent  work  that  uses  pattern  recognition  algorithms  applied  to  fMRI 
images  to  identify  brain  states.  The  brain  states  will  be  limited  to  those  that  occur  while  an 
individual  is  performing  a  task  of  interest,  such  as  those  taught  using  virtual-world  based  training 
systems.  No  attempt  will  be  made  to  model  brain  functionality  in  general,  which  is  presently  far 
too ambitious. The specific anatomical distribution of brain activity will be captured by a pattern
recognizer that is created using artificial neural networks trained on the fMRI data.

Experimental  Design 

For  our  stimuli,  we  used  3D  virtual  environments  like  those  used  in  training  simulators.  Each 
stimulus  was  created  using  Unreal  Development  Kit  3.0,  which  is  a  combination  authoring  and 
rendering  system  used  for  commercial  game  and  simulation  products.  Using  virtual  environments, 
as  opposed  to  photographs  or  drawings,  exploits  the  brain's  natural  design  for  operating  in  a  3D 
environment.  Compared  to  video,  virtual  environments  enable  interactivity  that  is  needed  for 
performing a task. Virtual environments also provide a high degree of control over the design of
the stimuli, as well as complete knowledge of their contents.

For  this  study,  we  created  a  small  virtual  town  suggestive  of  those  encountered  in  current 
Middle  East  combat  zones.  We  introduced  into  this  environment  three  categories  of  characters: 
soldiers,  insurgents,  and  indigenous  civilians  (see  below).  The  view  for  the  virtual  camera  was 
chosen to be first person, as is typical of most training systems, in order to create the impression of
being an agent in the environment. We created a very simple scenario suggestive of searching for
snipers. We alternated between moving the viewer through parts of the town in which there were
no  characters  present  (searching),  and  stopping  at  certain  locations  where  characters  appeared  in 
varying  combinations  (encountering).  The  characters  and  the  viewer  were  always  exhibiting  slight 
motion,  so  at  no  time  was  there  static  imagery. 

We  chose  a  mixed  block  design  for  our  initial  experiments.  In  many  examples  of  block  designs 
for  cognitive  neuroscience,  there  is  an  alternation  between  presenting  the  desired  stimulus  and 
displaying  a  blank  screen  (perhaps  with  a  cross-hair  to  give  the  subject  something  to  focus  on). 
This  technique  maximizes  the  contrast  when  a  single  simple  task  or  stimulus  is  to  be  quantified.  In 
our  case,  such  a  radical  alternation  would  be  alien  to  the  example  application  of  training.  Rather, 
we  chose  to  always  maintain  the  experience  of  being  in  the  virtual  world  with  continuous  motion. 
Our  use  of  block  design  alternates  between  the  two  modes:  1)  moving  through  the  town  with  no 
characters  visible  (searching),  2)  remaining  in  a  single  location  in  the  town  with  a  mix  of 
characters  directly  in  front  of  the  viewer  (encountering).  Each  of  the  two  phases  was  15  seconds 
long,  for  a  total  of  30  seconds  for  each  character  combination.  The  nature  of  the  encounters 
varied  from  block  to  block,  but  the  retention  of  a  block  design  structure  was  chosen  to  enable 
selection  of  responsive  voxels  in  the  fMRI  data. 
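
The block timing described above can be written out as a short schedule builder. This is only an illustrative sketch in Python, not the authors' stimulus code: the function names, the assumption that each block starts with the searching phase, and the choice of 16 blocks in the example are ours.

```python
from typing import List, Tuple

PHASE_SECONDS = 15.0  # from the text: each of the two phases lasts 15 seconds


def block_schedule(n_blocks: int) -> List[Tuple[float, float, str, int]]:
    """Return (start_s, end_s, phase, block_index) for every phase, in order.

    Assumption: each block begins with "searching" and ends with "encountering".
    """
    schedule = []
    t = 0.0
    for block in range(n_blocks):
        for phase in ("searching", "encountering"):
            schedule.append((t, t + PHASE_SECONDS, phase, block))
            t += PHASE_SECONDS
    return schedule


def phase_at(schedule, t_seconds: float) -> str:
    """Look up which phase is active at a given time in the scan."""
    for start, end, phase, _ in schedule:
        if start <= t_seconds < end:
            return phase
    raise ValueError("time lies outside the scheduled scan")


schedule = block_schedule(n_blocks=16)       # hypothetical scan with 16 blocks
total_seconds = schedule[-1][1]              # 16 blocks x 30 s
fundamental_hz = 1.0 / (2 * PHASE_SECONDS)   # one full cycle every 30 s
```

Because every block has the same 30-second period, the stimulus has a single fundamental frequency of 1/30 Hz, which is the property the block structure preserves for later voxel selection.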


Figure: One of the 16 character combinations used for the CONSTANT stimulus.

We used two stimulus conditions. The first was designed to test whether the neural network could
be trained to count the number of characters in the scene when all the characters were of a single
type: either soldiers or insurgents. The second was designed to evoke a variable level of threat by
presenting a mix of soldiers, insurgents, and civilians. The number of characters in the first
condition varied from 1 to 6, whereas the total number of characters in the second condition was
held constant at 6 in order to control for human visual processing that correlated only with the
number of characters. We refer to the first condition as VARYING (because of the varying
number of characters) and the second as CONSTANT (because of the constant number of
characters). For VARYING, each of the six possible counts (1 to 6) was presented twice in random
order within a single scan. For CONSTANT, the number of soldiers ranged from 0 to 3, and
similarly for the number of insurgents. The number of civilians was 6 - (# soldiers + # insurgents).
With four possible values (0 to 3) for each of the two counts, there were a total of 4 x 4 = 16
combinations. The order in which the combinations were presented to the subject was always a
random permutation.
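
As a check on the arithmetic, the CONSTANT mixes can be enumerated directly. The following Python sketch is ours, not the authors' stimulus generator; the function names and the seeded shuffle are illustrative assumptions.

```python
import itertools
import random

TOTAL_CHARACTERS = 6  # from the text: the CONSTANT condition always shows 6 characters


def constant_mixes():
    """Enumerate (soldiers, insurgents, civilians) for soldiers, insurgents in 0..3."""
    mixes = []
    for soldiers, insurgents in itertools.product(range(4), repeat=2):
        civilians = TOTAL_CHARACTERS - (soldiers + insurgents)
        mixes.append((soldiers, insurgents, civilians))
    return mixes


def presentation_order(mixes, seed=None):
    """Return a random permutation of the mixes (the seed is only for repeatability)."""
    rng = random.Random(seed)
    order = list(mixes)
    rng.shuffle(order)
    return order


mixes = constant_mixes()
order = presentation_order(mixes, seed=0)
```

The enumeration yields 16 mixes, each summing to 6 characters, with the civilian count ranging from 0 (three soldiers and three insurgents) to 6 (no soldiers or insurgents).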

Data  Acquisition 

The fMRI data was obtained using a GE Signa 3.0T scanner, located at the University of Texas at
Austin Imaging Research Center. Sessions began by obtaining a T1-weighted anatomical image on
the same slice prescription as the fMRI data using an SPGR sequence. The fMRI data was acquired
using a T2*-sensitive EPI sequence. Image quality was improved by using an 8-channel head-coil
array combined with a GRAPPA parallel imaging scheme. The GRAPPA speed-up factor was 3:1 to
obtain whole-brain volumes of 36-44 slices every 2 seconds, with a cubic voxel size of 2.5 mm per
side. The first 12 seconds of data was discarded to mitigate transient effects. The data was
converted to a per-voxel time series format. This data was then motion corrected, using a
rigid-body transformation, first within each scan and then between successive scans. Finally, a
timing correction was applied to compensate for the interleaved slice acquisition.

We  then  selected  a  relatively  small  subset  of  the  voxels  to  use  as  inputs  to  the  neural  network 
(a.k.a.  feature  selection).  This  selection  was  based  on  the  periodic  nature  of  the  block  design. 
Responses to a periodic stimulation, regardless of its form, will show power only at the
fundamental frequency of the stimulation and its harmonics. We performed a harmonic analysis in
which we summed the time-series power present at the fundamental and at harmonics 2 through 4.
Those voxels with a fractional power greater than a particular threshold were selected for the next
stage of processing. The power threshold was chosen to select a fixed number of voxels, typically
~3000.
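
The harmonic selection step can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' Matlab code: it assumes each time series spans a whole number of stimulus cycles (so the fundamental falls exactly on one FFT bin), it defines fractional power relative to all non-DC power, and the function names and synthetic data are ours.

```python
import numpy as np


def fractional_harmonic_power(ts, n_cycles, n_harmonics=4):
    """Fraction of each voxel's non-DC power at the fundamental and harmonics 2..n.

    ts: array of shape (n_voxels, n_timepoints) covering exactly n_cycles periods.
    """
    ts = np.asarray(ts, dtype=float)
    ts = ts - ts.mean(axis=1, keepdims=True)          # remove the mean (DC) level
    power = np.abs(np.fft.rfft(ts, axis=1)) ** 2
    power[:, 0] = 0.0
    # With n_cycles whole periods in the record, harmonic k sits at bin k * n_cycles.
    bins = [k * n_cycles for k in range(1, n_harmonics + 1)
            if k * n_cycles < power.shape[1]]
    total = power.sum(axis=1)
    total[total == 0] = np.inf                        # flat voxels get a fraction of 0
    return power[:, bins].sum(axis=1) / total


def select_voxels(ts, n_cycles, n_keep):
    """Keep the n_keep voxels with the highest fractional power (a fixed-count threshold)."""
    frac = fractional_harmonic_power(ts, n_cycles)
    return np.argsort(frac)[::-1][:n_keep]


# Synthetic demonstration: 30 s cycles sampled every 2 s give 15 samples per cycle.
rng = np.random.default_rng(0)
n_cycles, samples_per_cycle = 16, 15
t = np.arange(n_cycles * samples_per_cycle)
boxcar = ((t % samples_per_cycle) >= 8).astype(float)
data = rng.normal(size=(5, t.size))                   # five noise-only "voxels"
data[[0, 3]] += 3.0 * boxcar                          # two of them follow the block design
selected = select_voxels(data, n_cycles, n_keep=2)
```

Choosing the threshold by rank rather than by value mirrors the text's description of a threshold set to yield a fixed number of voxels.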

Results 

Both  of  our  stimulus  conditions  produced  statistically  significant  activity  in  a  variety  of  brain 
regions.  Similar  patterns  of  activity  were  also  obtained  by  more  conventional  forms  of  analysis 
such  as  correlation  with  a  best-fit  sinusoid  at  the  block-stimulus  frequency.  We  found  clusters  of 
activity  in  frontal  lobes,  posterior  parietal  lobes,  and  regions  of  ventral  occipital  cortex  often 
associated  with  object  selectivity. 

In  this  pilot  study,  our  objective  was  to  assess  the  viability  of  using  neural  networks  (NN)  to 
identify  which  brain  activation  patterns  were  indicative  of  certain  simple  characteristics  of  a 
dynamic  virtual  world.  To  our  knowledge,  this  has  never  been  done  before.  As  noted  above,  we 
targeted  two  conditions:  counting  the  number  of  characters  and  assessing  threat  level.  In  addition, 
mostly as a sanity check, we built NNs to distinguish between scenes of the town with and without
characters in them. Because we selected voxels based on their correlation to alternating between
these two cases, this presented a best-case scenario for NNs. We used 64-bit Matlab 2009 with the
Neural Network Toolbox running on Linux for all of our results.

For this study, we restricted the class of NN to the feed-forward variety (the most common).
These have a fixed structure characterized by a) the number of layers, b) the number of nodes in
each layer, and c) the transfer function for each layer. Given an instance of this fixed structure,
chosen from an endless number of possible such structures, the only variable components of the
network are the weights at each node in the network. It is the adjustment of these weights to fit the
example data that constitutes the “learning” process. This is a mathematical optimization
problem which, like essentially all such problems, does not yield to a direct solution. Rather, an
iterative search must be performed over the space of all possible weights. The search is guided by
measuring the error between the network’s current output and the target output, which is given as
part of the training set. The process gradually minimizes this error by traveling along the surface
of the error function in a direction that reduces the error. However, the error function almost
always has many local minima in which the process can become trapped, so some additional
element needs to be introduced to find the best local minimum among them. While simulated
annealing is a general approach to this kind of problem, it is not provided by the NN Toolbox.
Instead, we run the process multiple times, starting each run with a different set of randomly
chosen initial weights. We then evaluate the goodness of each generated network using the
validation set and keep the best one. As a basis for comparison, we calculate the average
performance for the NN structure that yielded the best performance. In addition to classification,
we also trained a network to give a continuous-valued output between 0 and 6 for counting
characters. The performance for this kind of regression is given by an R-value, where R = 1.0 is
comparable to 100% correct. This is in keeping with our plans to build a multi-dimensional
representation of neuro-state-space.
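
The restart strategy can be illustrated with a small stand-in network. The sketch below is written in Python with NumPy rather than the Matlab Neural Network Toolbox the study used, and everything in it is an assumption made for illustration: the single tanh hidden layer, plain gradient descent, the hyperparameters, and the synthetic two-class data.

```python
import numpy as np


def init_weights(n_in, n_hidden, n_out, rng):
    """Draw a fresh random starting point for one training run."""
    return {"W1": rng.normal(0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out)}


def forward(w, X):
    """Feed-forward pass: tanh hidden layer followed by a softmax output layer."""
    H = np.tanh(X @ w["W1"] + w["b1"])
    logits = H @ w["W2"] + w["b2"]
    logits -= logits.max(axis=1, keepdims=True)       # for numerical stability
    P = np.exp(logits)
    return H, P / P.sum(axis=1, keepdims=True)


def train(w, X, y, n_classes, epochs=300, lr=0.1):
    """Iteratively adjust the weights downhill on the mean cross-entropy error."""
    Y = np.eye(n_classes)[y]
    for _ in range(epochs):
        H, P = forward(w, X)
        d_logits = (P - Y) / len(X)
        d_H = (d_logits @ w["W2"].T) * (1.0 - H ** 2)  # back-propagate through tanh
        w["W2"] -= lr * H.T @ d_logits
        w["b2"] -= lr * d_logits.sum(axis=0)
        w["W1"] -= lr * X.T @ d_H
        w["b1"] -= lr * d_H.sum(axis=0)
    return w


def accuracy(w, X, y):
    return float((forward(w, X)[1].argmax(axis=1) == y).mean())


def best_of_restarts(X_tr, y_tr, X_val, y_val, n_classes,
                     n_hidden=8, n_restarts=5, seed=0):
    """Train from several random starts; keep the network that scores best on validation."""
    rng = np.random.default_rng(seed)
    scores, best = [], None
    for _ in range(n_restarts):
        w = init_weights(X_tr.shape[1], n_hidden, n_classes, rng)
        w = train(w, X_tr, y_tr, n_classes)
        score = accuracy(w, X_val, y_val)
        scores.append(score)
        if best is None or score > best[0]:
            best = (score, w)
    return best[1], best[0], float(np.mean(scores))


# Synthetic two-class data standing in for voxel inputs and stimulus labels.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 10)) + (2 * y[:, None] - 1)
net, best_score, mean_score = best_of_restarts(X[:150], y[:150], X[150:], y[150:],
                                               n_classes=2)
```

Returning both the best and the mean validation score corresponds to the comparison described in the text: the best network is kept, and the average over runs indicates how much the result depends on the starting weights.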

Here is a table of the results. Note the improvement from Subject A to the subsequent two
subjects, who were scanned several months after Subject A. This reflects our improvement of the
fMRI protocol and data processing.

Figure: Confusion matrix, CONSTANT A: Characters Yes/No.

Figure: Confusion matrix, CONSTANT A: Threat level. Rows are the class assigned by the NN;
columns are the target class. The last column and the bottom row give the percent correct.

Assigned class       1       2       3       4    Correct
1                   41       0       0       2      95.3%
2                    0       4       0       1      80.0%
3                    1       0       9       1      81.8%
4                    0       0       3      15      83.3%
Correct          97.6%    100%   75.0%   78.9%      89.6%

A more detailed graphical presentation of the best data for Subject A is shown in the
accompanying figures as “confusion matrices”. A confusion matrix is used to display the
effectiveness of a solution to a classification problem. For each target class in the test inputs, it
shows which classification was assigned by the NN. With a perfect classifier, only the diagonal
elements would contain counts of inputs. For example, in VARYING A: Counting, the target class
number is the number of characters plus 1. Column 3 shows that 12 input brain volumes, sampled
when 2 characters were presented, were classified correctly, while 2 were misclassified as 4
characters, resulting in an 85.7% success rate (bottom row). The lower right element shows the
overall performance of 94.8%.
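
The percentages quoted above follow directly from the cell counts. The Python sketch below recomputes them from the counts of the VARYING A: Counting matrix as we read them from the figure; the two diagonal cells that the extraction rendered as letters are taken to be 8, an assumption consistent with their 7.0% share of the 115 test inputs. Function names are ours.

```python
import numpy as np

# Rows: class assigned by the NN. Columns: target class (number of characters + 1).
counts = np.array([
    [57, 0,  0, 0, 0, 0, 0],
    [ 3, 7,  0, 0, 0, 0, 0],
    [ 0, 0, 12, 0, 0, 0, 0],
    [ 0, 0,  0, 9, 0, 0, 0],
    [ 0, 0,  2, 0, 8, 0, 1],
    [ 0, 0,  0, 0, 0, 8, 0],
    [ 0, 0,  0, 0, 0, 0, 8],
])


def per_target_rate(c):
    """Bottom row: fraction of each target class that was classified correctly."""
    return np.diag(c) / c.sum(axis=0)


def per_assigned_rate(c):
    """Last column: fraction of each assigned class that was actually correct."""
    return np.diag(c) / c.sum(axis=1)


def overall_rate(c):
    """Lower right element: correct classifications over all test inputs."""
    return np.trace(c) / c.sum()


column_3 = per_target_rate(counts)[2]   # target class 3, i.e. 2 characters: 12 of 14
overall = overall_rate(counts)          # 109 of 115
```

Column 3 gives 12/14, about 85.7%, and the overall rate is 109/115, about 94.8%, matching the figures quoted in the text.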


Figure: Confusion matrix, VARYING A: Characters Yes/No. Rows are the class assigned by the NN;
columns are the target class. The last column and the bottom row give the percent correct.

Assigned class       1       2    Correct
1                   56       0       100%
2                    1      58      98.3%
Correct          98.2%    100%      99.1%

Figure: Confusion matrix, VARYING A: Counting. Rows are the class assigned by the NN; columns
are the target class. The last column and the bottom row give the percent correct.

Assigned class       1       2       3       4       5       6       7    Correct
1                   57       0       0       0       0       0       0       100%
2                    3       7       0       0       0       0       0      70.0%
3                    0       0      12       0       0       0       0       100%
4                    0       0       0       9       0       0       0       100%
5                    0       0       2       0       8       0       1      72.7%
6                    0       0       0       0       0       8       0       100%
7                    0       0       0       0       0       0       8       100%
Correct          95.0%    100%   85.7%    100%    100%    100%   88.9%      94.8%



For the VARYING condition, we also performed a sensitivity analysis as a means to estimate
which of the input voxel time series most strongly influenced the outputs of the NN. Starting from
the mean input state of the system, each input was perturbed by a small amount, and the effect of
the perturbation upon the output state was noted. This process was repeated for all input voxels.
These sensitivity values were then normalized to their observed maximum across voxels, and the
results are visualized below. Only those voxels with sensitivities >10% are shown. Note that
the network is very selective: only ~300 of the original 2800 voxel time series are given a weight
>10%. Note also that these highly weighted voxels occur exclusively in the gray matter of the
brain. Finally, we observe that the majority of the highly weighted voxels are clustered in ventral
occipital regions that are probably part of object-selective visual cortex, and in posterior parietal
regions that are probably involved in visual attention and association. It is also worth noting that
the NN gave little weight to any frontal brain regions. Similar results were observed in Subject B,
except that there was less activation of posterior parietal regions. Thus, this analysis demonstrates
the efficacy of NNs in discerning task- or stimulus-relevant fMRI inputs, and provides us a means
to relate these associations back to individual brain anatomy.
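
The perturbation procedure can be sketched in a few lines. This Python example is a hedged illustration only: the step size, the use of the summed absolute output change as the effect measure, and the toy four-input model standing in for a trained network are all our assumptions.

```python
import numpy as np


def sensitivity(model, X, eps=1e-3):
    """Perturb each input around the mean input state and measure the output change.

    model: callable mapping a 1-D input vector to a scalar or 1-D output.
    X:     array of shape (n_samples, n_inputs) defining the mean input state.
    Returns sensitivities normalized to their maximum across inputs.
    """
    x0 = X.mean(axis=0)
    y0 = np.atleast_1d(model(x0))
    sens = np.empty(x0.size)
    for i in range(x0.size):
        x = x0.copy()
        x[i] += eps                                    # small perturbation of one input
        sens[i] = np.abs(np.atleast_1d(model(x)) - y0).sum() / eps
    peak = sens.max()
    return sens / peak if peak > 0 else sens


# Toy stand-in for a trained network: one tanh unit with known weights.
weights = np.array([2.0, 0.0, -1.0, 0.1])


def toy_model(x):
    return np.tanh(x @ weights)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
relative = sensitivity(toy_model, X)
selected = np.flatnonzero(relative > 0.10)             # keep inputs above the 10% cutoff
```

For this toy model the normalized sensitivities track the relative weight magnitudes, so only the first and third inputs pass the 10% cutoff, which is the kind of selectivity the text reports for the trained network.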


Figure: Network sensitivity. Sensitivity analysis results for Subject A.


Discussion 

The performance of the classifiers when compared to chance is excellent (see Table 1). The
performance of 100% for counting, both for classification and regression, on more recent subjects
is extremely encouraging. The better performance for Counting than for Threat-level probably
reflects the fact that assigning counting categories to stimuli is objective and that counting is a
low-level cognitive activity. For threat level, the assignment was only quasi-objective, being based
on counting the number of friends vs. foes while the total number of characters was held constant.

Our  approach  to  sensitivity  analysis  as  applied  to  multivariate/voxel  pattern  analysis  (MVPA)  is, 
as  far  as  we  know,  novel.  It  could  prove  to  be  an  important  new  technique  for  characterizing  which 
regions  of  the  brain  contribute  the  most  to  particular  cognitive  processing.  It  underscores  the 
importance of using whole-brain data rather than regions of interest. Indeed, the technique tells us
which regions are the most relevant to the cognitive computations. The technique could also be
used as part of the feature selection process by iteratively using the sensitivity to select voxels for
subsequent construction of NNs.

To  summarize,  we  have  demonstrated  the  following: 

1) Virtual Worlds like those used in military training can be used effectively as fMRI stimuli.

2) Neural Networks can be constructed that can reliably identify brain activation patterns
distinguishing the number of characters in the scene.

3) A variety of mixes of Soldiers, Insurgents and Civilians can be classified reliably into threat
level by an NN.